5 research outputs found

    Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset

    Full text link
    Vehicle classification is an active computer vision topic, with studies ranging from ground-view to top-view imagery. In remote sensing, the use of top-view images allows for understanding city patterns, vehicle concentration, traffic management, and more. However, there are some difficulties when aiming for pixel-wise classification: (a) most vehicle classification studies use object detection methods, and most publicly available datasets are designed for this task, (b) creating instance segmentation datasets is laborious, and (c) traditional instance segmentation methods underperform on this task since the objects are small. Thus, the present research objectives are to: (1) propose a novel semi-supervised iterative learning approach using GIS software, (2) propose a box-free instance segmentation approach, and (3) provide a city-scale vehicle dataset. The iterative learning procedure consists of: (1) labelling a small number of vehicles, (2) training on those samples, (3) using the model to classify the entire image, (4) converting the image prediction into a polygon shapefile, (5) correcting areas with errors and including them in the training data, and (6) repeating until the results are satisfactory. To separate instances, we considered vehicle interiors and vehicle borders, and the DL model was the U-Net with the EfficientNet-B7 backbone. Removing the borders isolates the vehicle interiors, allowing for unique object identification. To recover the deleted 1-pixel borders, we proposed a simple method to expand each prediction. The results show better pixel-wise metrics when compared to Mask R-CNN (82% against 67% IoU). In the per-object analysis, the overall accuracy, precision, and recall were greater than 90%. This pipeline applies to any remote sensing target and is very efficient for segmentation and dataset generation. Comment: 38 pages, 10 figures, submitted to journal.
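
    One straightforward way to realize the border-removal step is sketched below (not the authors' released code): assuming the per-pixel output is encoded as 0 = background, 1 = vehicle interior, and 2 = vehicle border, isolated interiors are labelled as separate objects and then expanded back into the removed border pixels.

    # Minimal sketch of border-removal instance separation; the class encoding
    # (0 = background, 1 = interior, 2 = border) is an assumption made for
    # illustration, and scipy stands in for the authors' GIS-based tooling.
    import numpy as np
    from scipy import ndimage

    def separate_instances(pred_map):
        """Label isolated vehicle interiors, then expand each label over the
        ~1-pixel border that was removed to keep touching vehicles apart."""
        interiors = pred_map == 1                        # drop the border pixels
        labels, num_objects = ndimage.label(interiors)   # one id per interior
        # For every unlabelled pixel, find the nearest labelled pixel ...
        nearest = ndimage.distance_transform_edt(
            labels == 0, return_distances=False, return_indices=True)
        expanded = labels[tuple(nearest)]
        # ... but keep the expansion only inside the predicted vehicle area.
        expanded[pred_map == 0] = 0
        return expanded, num_objects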

    Panoptic Segmentation Meets Remote Sensing

    Full text link
    Panoptic segmentation combines instance and semantic predictions, allowing the detection of "things" and "stuff" simultaneously. Effectively approaching panoptic segmentation in remotely sensed data is promising for many challenging problems since it allows continuous mapping and specific target counting. Several difficulties have prevented the growth of this task in remote sensing: (a) most algorithms are designed for traditional images, (b) image labelling must encompass "things" and "stuff" classes, and (c) the annotation format is complex. Thus, aiming to address these difficulties and increase the operability of panoptic segmentation in remote sensing, this study has five objectives: (1) create a novel data preparation pipeline for panoptic segmentation, (2) propose annotation conversion software to generate panoptic annotations, (3) propose a novel dataset of urban areas, (4) modify Detectron2 for the task, and (5) evaluate the difficulties of this task in the urban setting. We used an aerial image with a 0.24-meter spatial resolution and 14 classes. Our pipeline considers three image inputs, and the proposed software uses point shapefiles to create samples in the COCO format. Our study generated 3,400 samples with 512x512-pixel dimensions. We used Panoptic-FPN with two backbones (ResNet-50 and ResNet-101), and the model evaluation considered semantic, instance, and panoptic metrics. We obtained 93.9, 47.7, and 64.9 for the mean IoU, box AP, and PQ, respectively. Our study presents the first effective pipeline for panoptic segmentation and an extensive database for other researchers to use with other data or related problems requiring thorough scene understanding. Comment: 40 pages, 10 figures, submitted to journal.
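
    For reference, the panoptic quality (PQ) metric reported above combines segmentation and recognition quality. The sketch below computes it for a single class from boolean segment masks under the standard matching rule (IoU > 0.5); it is an illustrative re-statement of the metric, not the COCO panoptic evaluation code used in the study.

    # Illustrative single-class PQ computation; masks are boolean numpy arrays
    # of equal shape. An IoU > 0.5 match is unique by construction.
    import numpy as np

    def panoptic_quality(pred_segments, gt_segments, iou_thresh=0.5):
        matched_gt, tp_ious = set(), []
        for pred in pred_segments:
            for j, gt in enumerate(gt_segments):
                if j in matched_gt:
                    continue
                union = np.logical_or(pred, gt).sum()
                iou = np.logical_and(pred, gt).sum() / union if union else 0.0
                if iou > iou_thresh:
                    matched_gt.add(j)
                    tp_ious.append(iou)
                    break
        tp = len(tp_ious)
        fp = len(pred_segments) - tp          # unmatched predictions
        fn = len(gt_segments) - tp            # unmatched ground-truth segments
        denom = tp + 0.5 * fp + 0.5 * fn
        return sum(tp_ious) / denom if denom else 0.0   # PQ = SQ * RQ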

    Recognising Three-Dimensional Objects using Parameterized Volumetric Models

    No full text
    Centre for Intelligent Systems and their Applications
    This thesis addresses the problem of recognizing 3-D objects using shape information extracted from range images and parameterized volumetric models. The domain of geometric shapes explored is that of complex curved objects with articulated parts and a great deal of similarity between some of the parts. These objects are exemplified by animal shapes; however, the general characteristics and complexity of these shapes are present in a wide range of other natural and man-made objects. In model-based object recognition, three main issues constrain the design of a complete solution: representation, feature extraction, and interpretation. This thesis develops an integrated approach that addresses these three issues in the context of the above-mentioned domain of objects. For representation, I propose a composite description using globally deformable superquadrics and a set of volumetric primitives called geons; this description is shown to have representational and discriminative properties suitable for recognition. Feature extraction comprises a segmentation process that extracts a parts-based description of the objects as assemblies of deformable superquadrics. Discontinuity points detected from the images are linked using an 'active contour' minimization technique, and deformable superquadric models are then fitted to the resulting regions. Interpretation is split into three components: classification of parts, matching, and pose estimation. A Radial Basis Function (RBF) classifier algorithm is presented to classify the superquadric shapes derived from the segmentation into one of twelve geon classes. The matching component is decomposed into two stages. First, an indexing scheme makes effective use of the RBF classifier output to direct the search to the models that contain the identified parts; this makes the search more efficient and, with a model library organised in a meaningful and robust way, permits growth without compromising performance. Second, a method is proposed where the hypotheses picked from the index are searched using an Interpretation Tree algorithm combined with a quality measure, based on Possibility Theory (the Theory of Fuzzy Sets), to evaluate the bindings and the final valid hypotheses. The valid hypotheses ranked by the matching process are then passed to the pose estimation module. This module uses a Kalman Filter technique that includes the constraints on the articulations as perfect measurements, and as such provides a robust and generic way to estimate pose in object domains such as the one approached here. These techniques are then combined to produce an integrated approach to the object recognition task. The thesis develops such an integrated approach and evaluates its performance in the sample domain. Future extensions of each technique and the overall integration strategy are discussed.
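
    As a point of reference for the part models mentioned above, the sketch below evaluates the standard superquadric inside-outside function (global deformations such as tapering and bending are omitted); it illustrates the well-known formula and is not code from the thesis.

    # Standard superquadric inside-outside function: F < 1 inside the surface,
    # F = 1 on it, F > 1 outside. a1, a2, a3 are the axis lengths and
    # eps1, eps2 the shape exponents controlling squareness/roundness.
    import numpy as np

    def superquadric_F(x, y, z, a1, a2, a3, eps1, eps2):
        fx = np.abs(x / a1) ** (2.0 / eps2)
        fy = np.abs(y / a2) ** (2.0 / eps2)
        fz = np.abs(z / a3) ** (2.0 / eps1)
        return (fx + fy) ** (eps2 / eps1) + fz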

    Instance Segmentation for Governmental Inspection of Small Touristic Infrastructure in Beach Zones Using Multispectral High-Resolution WorldView-3 Imagery

    No full text
    Misappropriation of public lands is an ongoing government concern. In Brazil, the beach zone is public property, but many private establishments use it for economic purposes, requiring constant inspection. Among the targets of improper use, the individual mapping of straw beach umbrellas (SBUs) attached to the sand is a great challenge due to their small size, high density, and clustered appearance. This study aims to automatically detect and count SBUs on public beaches using high-resolution images and instance segmentation, obtaining pixel-wise semantic information and individual object detection. This study is the first instance segmentation application on coastal areas and the first using WorldView-3 (WV-3) images. We used Mask R-CNN with some modifications: (a) a multispectral input for the WorldView-3 imagery (eight channels), (b) an improved sliding window algorithm for large-image classification, and (c) a comparison of different image resizing ratios to improve small object detection, since the SBUs are small objects (< 32² pixels) even in high-resolution images (31 cm). The accuracy analysis used standard COCO metrics considering the original image and three scale ratios (2×, 4×, and 8× resolution increase). The average precision (AP) results increased with the image resolution: 30.49% (original image), 48.24% (2×), 53.45% (4×), and 58.11% (8×). The 8× model presented 94% AP50, classifying nearly all SBUs correctly. Moreover, the improved sliding window approach enables the classification of large areas, providing automatic counting and object size estimation, and proving effective for inspecting large coastal areas and providing insightful information for public managers. This remote sensing application impacts inspection costs, taxation, and environmental conditions.
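
    A generic overlapping sliding-window pass over a large scene could look like the sketch below; the window size, stride, and predict() callable are placeholders for illustration and do not reproduce the specific improvements described in the study.

    # Generic overlapping sliding-window inference; per-pixel scores from
    # overlapping tiles are averaged. Window/stride values are illustrative,
    # and the image is assumed to be at least one window in each dimension.
    import numpy as np

    def sliding_window_predict(image, predict, window=512, stride=384):
        """image: (H, W, C) array; predict(tile) returns a (window, window)
        score map. Returns an (H, W) averaged score map."""
        h, w = image.shape[:2]
        scores = np.zeros((h, w), dtype=np.float32)
        counts = np.zeros((h, w), dtype=np.float32)
        # Make sure the last row/column of tiles reaches the image edge.
        tops = sorted(set(range(0, h - window + 1, stride)) | {h - window})
        lefts = sorted(set(range(0, w - window + 1, stride)) | {w - window})
        for top in tops:
            for left in lefts:
                tile = image[top:top + window, left:left + window]
                scores[top:top + window, left:left + window] += predict(tile)
                counts[top:top + window, left:left + window] += 1.0
        return scores / np.maximum(counts, 1.0)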
